5/8/2020

Goal of the project

  • Understand how Vancouver businesses have evolved in the past
  • How might they change in the future?
  • Can this information be used to drive decision making?

Research Question

  • How long will a particular Vancouver-based business stay in operation?
  • Geospatial summary of Vancouver’s business landscape

Dataset Overview

  • Licence dataset from 1997 to 2020
    • Business type
    • Location
    • Number of employees
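
To answer "how long will a business stay in operation", the yearly licence records first need to be turned into one duration per business. The sketch below shows one possible way to do that, assuming hypothetical field names (business_id, licence_year, status) and made-up records; the real dataset's columns may differ. A business that still holds a licence in the last year of data is flagged as not closed, since its true lifetime is not yet known.

```python
from collections import defaultdict

# Hypothetical licence records: (business_id, licence_year, status).
# Field names and values are made up for illustration only.
records = [
    ("A", 1997, "Issued"), ("A", 1998, "Issued"), ("A", 1999, "Issued"),
    ("B", 2015, "Issued"), ("B", 2016, "Issued"),
    ("C", 2018, "Issued"), ("C", 2019, "Issued"), ("C", 2020, "Issued"),
]

LAST_YEAR = 2020  # final year covered by the dataset


def survival_table(records, last_year=LAST_YEAR):
    """Return {business_id: (years_observed, closed)}.

    closed is False when the business still holds a licence in the last
    year of data, i.e. its true lifetime is right-censored.
    """
    years = defaultdict(set)
    for business_id, year, status in records:
        if status == "Issued":
            years[business_id].add(year)
    table = {}
    for business_id, ys in years.items():
        first, last = min(ys), max(ys)
        table[business_id] = (last - first + 1, last < last_year)
    return table


durations = survival_table(records)
```

This assumes a gap-free licence history per business; gaps between licence years would need a separate rule.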

Dataset Overview

Potential future dataset(s)

  • Access to public transportation
  • Parking space
  • Rent per sq. ft.
  • Construction in the neighbourhood
  • Time taken to approve minor renovations
  • Registered capital
  • Franchise or not

Data Science Technique(s)

  • Data wrangling and database creation using Postgres
  • Baseline model:
    • Logistic regression
  • Advanced modeling:
    • Random forest, regularization
  • Further direction:
    • Survival analysis
  • Geospatial visualization:
    • Python and Altair/Leaflet
  • Deploy with Dash on Heroku
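
As an illustration of the survival-analysis direction, here is a minimal sketch of the Kaplan-Meier estimator in plain Python. It is not the project's implementation: the function name, inputs, and example numbers are assumptions made for illustration. The estimator multiplies, at each time with an observed closure, the fraction of at-risk businesses that survive, while businesses still operating (censored) simply leave the risk set.

```python
from collections import Counter


def kaplan_meier(durations, closed):
    """Kaplan-Meier estimate of S(t), the probability of surviving past t.

    durations: observed years in operation per business
    closed:    True if closure was observed, False if right-censored
    Returns a list of (t, S(t)) at each time where a closure occurred.
    """
    events = Counter(t for t, c in zip(durations, closed) if c)
    exits = Counter(durations)  # closures and censorings leaving the risk set
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = events.get(t, 0)
        if d:
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Made-up example: five businesses, two still operating (censored).
curve = kaplan_meier([1, 2, 2, 3, 4], [True, True, False, True, False])
```

Unlike a logistic-regression baseline on a fixed horizon, this uses the censored businesses up to the point they were last observed instead of discarding or mislabeling them.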

Potential difficulties

  • Identify and address any existing temporal correlation between the variables
  • Current variables may be proxies
  • Combine features from different data sources
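
Combining sources could look like the following sketch: a left join of licence records onto a neighbourhood-level table, keeping track of rows that found no match. All field names and values (neighbourhood, transit_stops, the counts) are made up for illustration, and the helper left_join is hypothetical.

```python
# Hypothetical licence records and neighbourhood-level features;
# all names and values are made up for illustration.
licences = [
    {"business_id": "A", "neighbourhood": "Downtown", "employees": 12},
    {"business_id": "B", "neighbourhood": "Kitsilano", "employees": 3},
    {"business_id": "C", "neighbourhood": "Sunset", "employees": 5},
]
neighbourhood_features = {
    "Downtown": {"transit_stops": 40},
    "Kitsilano": {"transit_stops": 15},
}


def left_join(rows, lookup, key):
    """Attach lookup[row[key]] to each row; unmatched rows get no extra fields."""
    return [{**row, **lookup.get(row[key], {})} for row in rows]


combined = left_join(licences, neighbourhood_features, "neighbourhood")

# Businesses whose neighbourhood had no match in the second source.
unmatched = [r["business_id"] for r in combined if "transit_stops" not in r]
```

Listing the unmatched rows makes coverage gaps between sources visible before modeling.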

Final product

  • A data pipeline
  • A geospatial visualization of Vancouver’s business landscape

Timeline (Week 1 - 2)

Hackathon and Proposal

Dates        Deliverables                 Objectives
May 4 - 8    -                            1. Prepare proposal presentation
                                          2. Set up GitHub repository, prepare data downloading script and data dictionary
May 8        Proposal Presentation        -
May 11 - 15  -                            1. Prepare proposal report
                                          2. EDA
                                          3. Finalize form of final product
May 12       Proposal Report to Mentor    -
May 15       Proposal Report to Deetken   -

Timeline (Week 3 - 4)

Launch Project, Develop ML Model, and Build Visualization

Dates        Deliverables           Objectives
May 18 - 22  -                      1. Incorporate feedback from mentor and Deetken on proposal
                                    2. Launch project (both modeling and visualization parts)
May 20       Meeting with Deetken   -
May 25 - 29  -                      Develop machine learning model and geospatial visualization

Timeline (Week 5 - 6)

Fine-tuning

Dates       Deliverables   Objectives
Jun 1 - 12  -              1. Fine-tune model and visualization
                           2. Update documentation or user manual for the end product

Timeline (Week 7 - 9)

Final Stage

Dates        Deliverables                          Objectives
Jun 15 - 19  -                                     Prepare for final presentation
Jun 18 - 19  Final Presentation                    -
Jun 22 - 26  -                                     Prepare final report and end data product
Jun 23       Final Report and Product to Mentor    Modify based on feedback from mentor
Jun 29       Final Report and Product to Deetken   Final presentation to Deetken
Jun 30       Teamwork Reflection                   -

Thank you!

Questions?